Hematology
Few-shot Cross-country Generalization of Tabular Machine Learning and Foundation Models for Childhood Anemia Prediction under Distribution Shift
Brima, Yusuf, Atemkeng, Marcellin, Kallon, Lansana Hassim, Niyukuri, David, Vacavant, Antoine, Saidu, Samuel, Chen, Ding-Geng
Background Childhood anemia affects an estimated 40% of children aged 6-59 months globally and arises from heterogeneous nutritional, infectious, and socioeconomic factors that vary substantially across settings. This variability challenges the generalizability of predictive machine learning models, which often degrade under cross-population or temporal shifts. We investigated the utility of a modern transformer-based tabular foundation model (TabPFN) as a complementary framework to classical supervised machine learning methods across diverse country contexts, with particular attention to data-scarce settings where surveillance capacity is most limited. Methods We conducted a multi-country prediction study using Demographic and Health Surveys (DHS) children's recode data from 16 countries spanning Africa, Asia, Latin America, the Caucasus, and the Middle East. The harmonized analytic cohort comprised n = 68,856 children aged 6-59 months with valid hemoglobin measurements. Anemia was defined using WHO age- and altitude-adjusted thresholds and treated as a binary outcome. We trained Logistic Regression, XGBoost, and LightGBM models using standard supervised learning, and evaluated TabPFN v2.6 in an in-context learning setting. Performance was assessed using Area Under the Receiver Operating Characteristic Curve (AUC-ROC) and other standard classification metrics, with calibration evaluated via Brier score and expected calibration error (ECE). Uncertainty in performance estimates was quantified using bootstrap resampling to derive 95% confidence intervals. Robustness was assessed in a few-shot learning setting. Cross-population generalization was examined using leave-one-country-out (LOCO) validation and reverse-LOCO experiments to assess directional transferability. Subgroup analyses were conducted across five demographic strata: child age group, sex, maternal education, residence type, and household wealth quintile.
Feature importance was assessed using standard linear and tree-based explainer SHAP values for the three supervised models and an adapted version of SHAP for TabPFN, aggregated across countries and examined at the country level. Results TabPFN also yielded the best probabilistic calibration across all 16 countries, achieving the lowest mean Brier score (0.203) and expected calibration error (ECE = 0.042) of all models evaluated; LightGBM and Logistic Regression exhibited the greatest miscalibration, particularly at higher predicted probabilities. Under full-data conditions, within-country discrimination was moderate across all models (AUC-ROC 0.59-0.76). Under LOCO validation, performance declined modestly (AUC-ROC 0.58-0.69). Reverse-LOCO analyses revealed asymmetric and directional transferability, with epidemiologically diverse populations serving as more informative training sources and certain target populations remaining persistently difficult to predict regardless of model or training data.
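The calibration metrics named in this abstract can be illustrated with a short sketch. It is not the authors' code: the equal-width binning with 10 bins, the percentile bootstrap, the number of replicates, and all function names are assumptions chosen for illustration, since the abstract does not state these details.

```python
import numpy as np


def brier_score(y_true, p):
    """Mean squared difference between predicted probability and binary outcome."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y_true) ** 2))


def expected_calibration_error(y_true, p, n_bins=10):
    """ECE with equal-width probability bins (binning scheme is an assumption)."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    bin_idx = np.digitize(p, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            # Weight each bin's |observed rate - mean prediction| by its share of samples.
            ece += mask.mean() * abs(y_true[mask].mean() - p[mask].mean())
    return float(ece)


def bootstrap_ci(metric, y_true, p, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        i = rng.integers(0, n, size=n)  # resample rows with replacement
        stats.append(metric(y_true[i], p[i]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

Calling `bootstrap_ci(brier_score, y, p)` on one country's held-out labels and predicted probabilities would give an interval of the kind the abstract describes; a lower Brier score and a lower ECE both indicate better calibration.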
Semi-Parametric Bayesian Additive Regression Trees for Risk Prediction with High-Dimensional Epigenetic Signatures and Low-Dimensional Covariates
Bhandari, Saurabh, Bhatti, Parveen, Chiu, Brian C. -H., Ji, Yuan
In the era of precision medicine, genome-wide epigenetic modifications offer rich data that could inform risk prediction. However, these data are high-dimensional and exhibit complex dependence structures, which makes it difficult to jointly model them with low-dimensional covariates when the goal is to obtain interpretable effect estimates for covariate adjustment. Standard Bayesian additive regression trees (BART) provide strong predictive performance but treat all predictors uniformly within the tree ensemble, obscuring the contributions of significant covariates and complicating variable selection in high-dimensional settings. We propose a semi-parametric BART model (spBART) that addresses this limitation by modeling low-dimensional covariates through a parametric component with interpretable coefficients, while capturing complex nonlinear associations among high-dimensional predictors through the tree ensemble. To perform stable variable selection, we develop a cross-validation-based procedure that aggregates posterior inclusion probabilities across folds and applies Bayesian false discovery rate control. We apply the proposed method to a pooled case--control analysis of high-dimensional genome-wide 5-hydroxymethylcytosine profiles derived from circulating cell-free DNA in two multiple myeloma studies ($N = 869$). The approach identifies a parsimonious set of candidate loci and achieves strong out-of-sample discrimination (AUC $= 0.96$) in a held-out validation set. Overall, spBART provides a unified framework for combining interpretable covariate inference with flexible modeling and variable selection in high-dimensional biomedical studies.
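The variable-selection step described above (aggregating posterior inclusion probabilities across folds, then applying Bayesian false discovery rate control) can be sketched as follows. This is a generic illustration rather than the spBART procedure itself: averaging across folds, the 0.05 target level, and the function name are assumptions, and the rule shown is the common one of selecting the largest set whose mean posterior probability of exclusion stays at or below the target.

```python
import numpy as np


def bayesian_fdr_select(pip_by_fold, alpha=0.05):
    """Select predictors under a Bayesian FDR target.

    pip_by_fold: array-like of shape (n_folds, n_predictors) holding
    posterior inclusion probabilities from each cross-validation fold.
    Returns the sorted indices of the selected predictors.
    """
    # Aggregate across folds by simple averaging (an assumption).
    pip = np.asarray(pip_by_fold, dtype=float).mean(axis=0)
    order = np.argsort(-pip)                # most likely predictors first
    local_fdr = 1.0 - pip[order]            # posterior probability of a false inclusion
    # Expected FDR of the top-k set is the running mean of local_fdr;
    # it is non-decreasing because local_fdr is sorted ascending.
    running_fdr = np.cumsum(local_fdr) / np.arange(1, len(pip) + 1)
    k = int(np.sum(running_fdr <= alpha))
    return np.sort(order[:k])


# Toy example: 2 folds, 4 candidate loci.
pips = [[0.99, 0.20, 0.97, 0.50],
        [0.97, 0.10, 0.95, 0.60]]
selected = bayesian_fdr_select(pips, alpha=0.05)
```

Averaging over folds dampens predictors that receive a high inclusion probability in only one fold, which is one way to read the stability goal stated in the abstract.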
SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference
Qi, Shi-ang, Balazadeh, Vahid, Cooper, Michael, Greiner, Russell, Krishnan, Rahul G.
Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).
TabPFN-3: Technical Report
Grinsztajn, Léo, Flöge, Klemens, Key, Oscar, Birkel, Felix, Jund, Philipp, Roof, Brendan, Manium, Mihir, Bin, Shi, Hoo, null, Bühler, Magnus, Garg, Anurag, Safaric, Dominik, Robertson, Jake, Jäger, Benjamin, Alessi, Simone, Hayler, Adrian, Moroshan, Vladyslav, Purucker, Lennart, Singer, Philipp, Arazi, Alan, Siems, Julien, Metzen, Jan Hendrik, Grab, Georg, Erickson, Nick, Guo, Siyuan, Kalfon, Eliott, Bing, Simon, Salinas, David, Cornu, Clara, Wehrhahn, Lilly Charlotte, Kriuchkova, Diana, Kaya, Kursat, Sidhoum, Lydia, Salmon, Marie, Chen, Jerry, Hulsebos, Madelon, LeCun, Yann, Müller, Samuel, Schölkopf, Bernhard, Gambhir, Sauraj, Hollmann, Noah, Hutter, Frank
Tabular data underpins most high-value prediction problems in science and industry, and TabPFN has driven the foundation model revolution for this modality. Designed with feedback from our users, TabPFN-3 builds on this foundation to scale state-of-the-art performance to datasets with 1M training rows and substantially reduce training and inference time. Pretrained exclusively on synthetic data from our prior, TabPFN-3 dramatically pushes the frontier of tabular prediction and brings substantial gains on time series, relational, and tabular-text data. On the standard tabular benchmark TabArena, a forward pass of TabPFN-3 outperforms all other models, including tuned and ensembled baselines, by a significant margin, and pareto-dominates the speed/performance frontier. On more diverse datasets, TabPFN-3 ranks first on datasets with many classes, and beats 8-hour-tuned gradient-boosted-tree baselines on datasets up to 1M training rows and 200 features. TabPFN-3 introduces test-time compute scaling to tabular foundation models. Our API offering TabPFN-3-Plus (Thinking) exploits this to beat all non-TabPFN models by over 200 Elo on TabArena, rising to 420 Elo on the largest data subset, and outperforms AutoGluon 1.5 extreme while being 10x faster, without using LLMs, real data, internet search or any other model besides TabPFN. TabPFN-3 extends the capabilities of our models, enabling SOTA prediction on relational data (new SOTA foundation model on RelBenchV1) and tabular-text data (SOTA on TabSTAR via TabPFN-3-Plus); and improves existing integrations: a specialized checkpoint, TabPFN-TS-3, ranks 2nd on the time-series benchmark fev-bench, and SHAP-value computation is up to 120x faster. TabPFN-3 achieves this performance while being up to 20x faster than TabPFN-2.5. In addition, a reduced KV cache and row-chunking scale to 1M rows on one H100 with fast inference speed.
Supplementary Material Responsibility Statement
Hyponatremia: Predict whether a hyponatremia lab comes back as normal (>=135 mmol/L), mild (>=130 and <135 mmol/L), moderate (>=125 and <130 mmol/L), or severe (<125 mmol/L). We consider all lab results coded as LOINC/LG11363-5, LOINC/2951-2, or LOINC/2947-0. Anemia: Predict whether an anemia lab comes back as normal (>=120 g/L), mild (>=110 and <120 g/L), moderate (>=70 and <110 g/L), or severe (<70 g/L). We consider all lab results coded as LOINC/LP392452-1. Please note that for the results of our baseline experiments in Section 5, we reframe these lab value tasks as binary classification tasks, where a label is "negative" if the result is normal and "positive" otherwise.
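The label definitions above translate directly into threshold functions. The sketch below uses the cut-offs exactly as stated; the function names are hypothetical, and inputs are assumed to already be in mmol/L (sodium) and g/L (hemoglobin).

```python
def classify_sodium(mmol_per_l: float) -> str:
    """Hyponatremia severity from a sodium result in mmol/L."""
    if mmol_per_l >= 135:
        return "normal"
    if mmol_per_l >= 130:
        return "mild"
    if mmol_per_l >= 125:
        return "moderate"
    return "severe"


def classify_hemoglobin(g_per_l: float) -> str:
    """Anemia severity from a hemoglobin result in g/L."""
    if g_per_l >= 120:
        return "normal"
    if g_per_l >= 110:
        return "mild"
    if g_per_l >= 70:
        return "moderate"
    return "severe"


def to_binary(severity: str) -> str:
    """Binary reframing: normal is negative, any abnormal grade is positive."""
    return "negative" if severity == "normal" else "positive"
```

For example, `to_binary(classify_hemoglobin(105))` yields `"positive"`, because 105 g/L falls in the moderate range.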
Profile Graphical Models
Avalos-Pacheco, Alejandra, Lupparelli, Monia, Stingo, Francesco C.
We introduce a novel class of graphical models, termed profile graphical models, that represent, within a single graph, how an external factor influences the dependence structure of a multivariate set of variables. This class is quite general and includes multiple graphs and chain graphs as special cases. Profile graphical models capture the conditional distributions of a multivariate random vector given different levels of a risk factor, and learn how the conditional independence structure among variables may vary across these risk profiles; we formally define this family of models and establish their corresponding Markov properties. We derive key structural and probabilistic properties that underpin a more powerful inferential framework than existing approaches, underscoring that our contribution extends beyond a novel graphical representation. Furthermore, we show that the resulting profile undirected graphical models are independence-compatible with two-block LWF chain graph models. We then develop a Bayesian approach for Gaussian undirected profile graphical models based on continuous spike-and-slab priors to learn shared sparsity structures across different levels of the risk factor. We also design a fast EM algorithm for efficient inference. Inferential properties are explored through simulation studies, including comparisons with competing methods. The practical utility of this class of models is demonstrated through the analysis of protein network data from various subtypes of acute myeloid leukemia. Our approach yields a more parsimonious network and reveals greater patient heterogeneity than competing methods, highlighting its enhanced ability to capture subject-specific differences.
Identification of physiological shock in intensive care units via Bayesian regime switching models
Kendall, Emmett B., Williams, Jonathan P., Storlie, Curtis B., Radosevich, Misty A., Wittwer, Erica D., Warner, Matthew A.
Detection of occult hemorrhage (i.e., internal bleeding) in patients in intensive care units (ICUs) can pose significant challenges for critical care workers. Because blood loss may not always be clinically apparent, clinicians rely on monitoring vital signs for specific trends indicative of a hemorrhage event. The inherent difficulty of diagnosing such an event can lead to late intervention by clinicians, which can have catastrophic consequences. Therefore, a methodology for early detection of hemorrhage has wide utility. We develop a Bayesian regime switching model (RSM) that analyzes trends in patients' vitals and labs to provide a probabilistic assessment of the underlying physiological state that a patient is in at any given time. This article is motivated by a comprehensive dataset we curated from Mayo Clinic of 33,924 real ICU patient encounters. Longitudinal response measurements are modeled as a vector autoregressive process conditional on all latent states up to the current time point, and the latent states follow a Markov process. We present a novel Bayesian sampling routine to learn the posterior probability distribution of the latent physiological states, and we develop an approach to account for pre-ICU-admission physiological changes. A simulation and a real case study illustrate the effectiveness of our approach.
Japan Approves the World's First Treatment Made With Reprogrammed Human Cells
Researchers in Japan pioneered reprogrammed cells 20 years ago. Now the country has given the first-ever authorizations to manufacture and sell medical products based on the technology. [Image: human iPS cell colony established from fibroblasts; actual width approximately 0.5 mm.] On March 6, Japan's Ministry of Health, Labor and Welfare officially granted conditional and time-limited marketing authorization to two regenerative medical products derived from reprogrammed iPS cells, marking exactly 20 years since the creation of mouse iPS cells.